(54) Technique for embedding a code in an audio signal and for detecting the embedded code

(57) A code is embedded into an audio product so as to be processed
therewith for recording and/or broadcast and yet be reliably detected
while remaining inaudible to human perception. The code is represented
by symbols formed from an impulse function having its energy within a
specified frequency range. The audio product is analyzed to find
segments which can mask the code based on tonality and a minimum signal
energy. When the audio product with an embedded code is detected,
decoding thereof involves finding candidate code signals which are
checked against preset criteria. In particular, each symbol is made of
at least two impulse functions with a preset spacing therebetween.
Description

[0001] The invention is directed to an improved technique for coding an audio signal and, in particular, to embedding
a code into an audio signal so that a decoder can detect the code reliably despite signal degradation.
[0002] Audio signals are generated in a variety of ways, such as by radio and television stations, and transmitted in 
various ways, such as by means of airwaves, cable and satellite as well as distributed on magnetic tape and storage 
disc (e.g. optic, magnetic) media. Various benefits are derived from identifying these audio signals which constitute 
"audio products" in the form of programs or commercials, for example. The audio products can be broadcast by radio,
television or cable stations and/or stored on tape, CD-ROM or other media for replay by the consumer. By being able to 
automatically distinguish one audio product from another, it becomes possible to perform a variety of services. For 
example, air time verification is possible to verify for an advertiser that a commercial has actually been broadcast and 
that it was aired in its entirety, at the proper time and in the locations that were paid for. In addition, performance royalty 
revenues can be more accurately calculated based on the frequency with which a piece of music, say, has been
broadcast. For these and other reasons, it is highly desirable to know when a particular audio product has been "performed"
in the sense that it has been heard by any member(s) of a listening audience. Furthermore, the listening (or watching)
audience can be measured by having individual members or individual households equipped with devices capable of 
identifying certain designated audio products, and then processing the resultant data. This can help measure the pop- 
ularity of a program so that its value to advertisers can be assessed. Also, the exposure of an audience to a commercial 
can be measured this way, and such information can be combined with other data to determine the effectiveness of that
commercial in terms of how well it is remembered and/or the resulting purchases made thereafter. 
[0003] The automation of this identification by the prior art has involved various techniques for embedding a code in
the audio product. The resulting signal is reproduced by, say, a speaker of a radio or television set. The embedded code
is also reproduced by the speaker so that it can be detected by a sensing device for data storage and/or processing to
yield the desired information. Various types of encoding schemes are known. However, they have proved to be
unsatisfactory for one or more of the following reasons. If the code is easily removable without permission, then the accuracy
of the desired measurement will obviously be skewed. Therefore, it is important for the embedded code to be "indelible"
in the sense that it cannot be removed without seriously (or at least noticeably) damaging the audio product. Also, the
code must not create any audible deterioration in the quality of the audio product itself, i.e. which can be discerned by
a human listener. Furthermore, the code must have adequate immunity to noise which occurs during the sending,
playback and receiving operations of the encoded audio product. For example, an audio product is typically exposed to
various phase shifts and time shifts in the process of being recorded and/or broadcast. In addition, the audio product may
be compressed by a bit rate reduction system based on psychoacoustic compression techniques, such as EUREKA
147, DOLBY AC3, and MPEG2. (The term "psychoacoustic" has to do with the human auditory response to a sound
stimulus.) The code must withstand such processing while still maintaining its characteristics for enabling it to be
reliably recovered by the decoder while remaining inaudible. Meeting all of these requirements has proven to be too tall
an order for the prior art, particularly when combined with the need to minimize the complexity of the apparatus and
method, and to carry out the technique quickly and efficiently.

[0004] One aspect of the present invention provides an improved technique for identifying an audio product with an
embedded code. 

[0005] Another aspect of the present invention embeds the code indelibly. 

[0006] A further aspect of the present invention embeds the code in such a way that it is not discernible to the listener
when the audio product is reproduced audibly.

[0007] One other aspect provides an improved encoding technique.

[0008] Still another aspect of the present invention recovers the embedded code despite signal degradation. 
[0009] Yet another aspect of the present invention recovers the embedded code despite signal compression. 
[0010] One other aspect of the present invention provides an improved decoding technique.
[0011] Another aspect of the present invention enables adaptive masking of the code within the audio product.
[0012] One aspect of the invention provides a method and apparatus for embedding a digital code in a digitized audio
product by filtering the digitized audio product to a frequency band of interest. A tonality indication is determined for
each of a plurality of segments of the filtered audio product which indicates the extent to which power is distributed
uniformly for frequencies in at least a portion of the band of interest. At least a portion of the digital code is inserted into a
particular segment from the plurality of segments only if the tonality indication indicates a relatively uniform power
distribution in that particular segment.

[0013] Another aspect of the present invention is directed to a method and apparatus for embedding a digitized code
in a digitized audio product by filtering the digitized audio product to a frequency band of interest and providing a coding
signal derived from a band-limited impulse function with a waveform having its energy confined to and evenly spread
across at least a portion of the frequency band of interest. The digitized code is derived from the coding signal, and the
digitized code is inserted into the audio product.



EP0913952A2 



[0014] Yet another aspect of the present invention is directed to a method and apparatus for providing a digitized code
to be embedded in a digitized audio product by providing the digitized code as a series of binary bits, and dividing the
binary bits into groups, each group having a plurality of bits. Coding signals are provided to represent the bits,
respectively. A symbol is derived from the coding signals for each of the groups, each symbol having a plurality of the coding
signals with a preset spacing therebetween.

[0015] One other aspect of the present invention is directed to a method and apparatus for encoding and decoding a
digitized code embedded in a digitized audio product by deriving the digitized code in a form of start, data and end
symbol types, each symbol representing a plurality of bits, and each bit being associated with a coding signal of given
polarity. The start type of symbol is generated to consist of a plurality of the coding signals all of which have the same
designated polarity. The digitized code is embedded in the digitized audio product. The digitized code embedded in the
audio product is detected, and the detected digitized code is decoded by determining whether the polarity of the coding
signals on the start type of symbol is the designated polarity and, if not, inverting the polarity of the coding signals in
the data and end types of symbols.
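The polarity check described in this aspect can be illustrated with a minimal sketch. It assumes that each detected symbol is represented as a list of ping polarities (+1 or -1) and that the designated polarity is +1; the function name, the data layout and the handling of a start symbol with mixed polarities are illustrative assumptions, not taken from the specification.

```python
def correct_polarity(start_symbol, other_symbols, designated=1):
    """Undo a polarity inversion of the channel using the start symbol.

    start_symbol  -- list of detected ping polarities (+1 / -1), hypothetical layout
    other_symbols -- list of data/end symbols, each a list of polarities
    designated    -- polarity the start symbol was generated with (assumed +1)
    """
    if all(p == designated for p in start_symbol):
        # Start symbol arrived as generated: nothing to invert.
        return [list(sym) for sym in other_symbols]
    if all(p == -designated for p in start_symbol):
        # Whole signal was inverted in transit: flip data and end symbols.
        return [[-p for p in sym] for sym in other_symbols]
    # Mixed polarities are not a valid start symbol (assumption of this sketch).
    raise ValueError("start symbol does not have a uniform polarity")
```

For example, a start symbol detected as [-1, -1] causes every following data and end symbol to be inverted before the bits are read out.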

[0016] Still another aspect of the present invention is directed to a method and apparatus for embedding a digitized
code in a digitized audio product by identifying segments of the digitized audio product into which the digitized code can
be embedded based on predetermined criteria. Portions of the digitized code are generated for insertion into the
segments, respectively. The digitized audio product within the identified segments is removed, except for a predetermined
small percentage of amplitude, to generate modified segments, and the portions of the digitized code are inserted into
the modified segments, respectively.
[0017] A further aspect of the present invention is directed to a method and apparatus for embedding a digitized code
in a digitized audio product by analyzing the digitized audio product to derive measured values for designated
characteristics thereof. Segments of the digitized audio product are located, based on the derived measured values and a set
of preselected parameters, into which the digitized code can be inserted so as to be masked. The digitized code is
inserted into the located segments, and a determination is made whether a degree of masking of the inserted digitized
code meets a predetermined level and, if not, values of at least one of the set of preselected parameters are modified.
Then, the locating and inserting steps are performed again with the modified values.

[0018] A still further aspect of the present invention is directed to a method and apparatus for embedding a digitized
code in a digitized audio product by dividing the digitized code into preselected portions, and representing the portions
by a plurality of coding symbols, respectively. The spacing of the coding symbols from each other is determined to be
used for embedding the digitized code within the audio product so that the spacing is greater than a predetermined
minimum, and the coding symbols are inserted within the audio product based on the determined spacing.
[0019] One further aspect of the present invention is directed to a method and apparatus for decoding an audio
product into which a code of digitized coding signals has been embedded by obtaining a digitized audio product and
comparing the digitized audio product with a template of a coding signal to identify candidate coding signals based on
shape. Pairs of sequential candidate coding signals are compared with each other based on preselected characteristics
to identify which ones constitute the coding signals, and the code is then reconstructed from the coding signals identified
by the comparing step.

[0020] Since the present invention can be implemented as a computer program running on a conventional computer,
the present invention can be embodied as a storage medium containing computer code, and as an electrical signal
carrying such computer code. Further, the present invention encompasses an audio signal generated with embedded
digital code as described hereinabove.

[0021] Embodiments of the present invention will now be described with reference to the accompanying drawings, in
which:

Figure 1 is a general flow chart for the encoder of one embodiment of the invention.
Figure 2 is a flow chart for the Analyzer function.
Figure 3 is a flow chart of the Locator function.
Figure 4 is a flow chart of the Inserter function.
Figure 5 is a graph of the 2 ping symbol frame.
Figure 6 is a graph of the 3 ping symbol frame.
Figure 7 is a low pass filter impulse response for a filter such as is used for the Analyzer.
Figure 8 is a low pass filter frequency response for the filter of Fig. 7.
Figure 9 is a bandpass filter frequency response for a filter such as is used for the Analyzer.
Figure 10 is a bandpass filter impulse response for the filter of Fig. 9.
Figure 11 is a bandpass filter frequency response for a filter such as is used for the Inserter.
Figure 12 is a bandpass filter impulse response for the filter of Fig. 11.
Figure 13 is a ping waveform spectral content.
Figure 14 is a ping waveform, time domain.
Figure 15 is a general flow chart for the decoder of one embodiment of the invention.
Figure 16 is a flow chart for the filtering and normalization functions of the decoder.
Figure 17 is a flow chart for the ping identification function.
Figure 18 is a flow chart for the symbol identification function.


ENCODER 

[0022] An embodiment of the invention includes an encoding technique for embedding alphanumeric codes in the
sound waveform of a sampled audio product. This requires a modification of the sound waveform. However, the
encoding technique is designed to render the modification inaudible in the sense that it cannot be discerned by a person with
normal hearing. The encoding technique modifies only short portions of the sound waveform (the content of these
modified portions is called a "symbol" which is used to form a code) and has no effect on the remaining waveform.
Inaudibility of the symbols is achieved by careful selection of the symbol locations and the manner with which the code is
inserted. The coding technique uses only a specific frequency band within the audio frequency range which has been
selected to allow the embedded information to be recovered by a decoder (described below), even if the signal has
passed through a low-quality transmission channel. The technique is also reasonably tolerant of frequency variations,
such as produced by fluctuations in record/playback tape speeds.

OVERVIEW 


[0023] A general description of the encoder of an embodiment of this invention is provided in relation to Fig. 1. The
original sound waveform of the audio product is received as a serial stream of data and is stored into memory in the
form of a digital audio Input File 1, e.g. in WAVE format (Multimedia programming interface and data specification No.
V1.0 from IBM and Microsoft), which can be either mono or stereo. This is derived from a digital source of an audio
signal or in a well known way by digitally sampling an analog audio signal. The sampling frequency is preferably either
44,100 samples/sec. or 48,000 samples/sec. The sampling frequency of 44,100 samples/sec. is the standard adopted
for professional CD recordings and also by some radio stations. The sampling frequency of 48,000 samples/sec. is
used by most radio stations and also is the standard adopted for digital TV. The encoded audio Output File 13 produced
by the encoder is also stored in the form of a WAVE file.
[0024] The operation of the encoder is controlled by a number of selectable parameters 8. It may be necessary to
change these parameters on occasion, for certain types of audio products. For example, audio products produced for
the motion picture industry to be shown in theatres have a high dynamic range (e.g. 90 dB) and a high peak to average
signal ratio, whereas the audio products produced for broadcast transmission have a reduced dynamic range (e.g. 40
dB) and a low peak to average signal ratio. Each type requires parameter selection designed for optimal symbol
insertion rates.

[0025] The encoder is preferably implemented to perform three separate functions which are called herein the
Analyzer, Locator and Inserter functions. Referring to Fig. 1, the Analyzer 3 analyzes the Input File 1 and produces an
intermediate file called Analysis File 5. This process of analysis is not affected by the selectable parameters 8. The Locator
7 reads the Analysis File 5 and decides exactly where and how the symbols should be inserted, so that they are
psychoacoustically well masked by the audio product. This process is affected by receiving certain ones of parameters 8.
A list of symbol description records is written to a file 9, called an Insertion File. The Inserter 11 reads the original Input
File 1 and the Insertion File 9, and implements the symbol insertions described in the latter. This process is also
affected by receiving certain ones of parameters 8. The resulting encoded audio is written to a WAVE format Output File
13.

[0026] In situations where the encoding needs to be tried repeatedly, with different values of the parameters (such
as for the adaptive masking described below or for verifying quality performance), the Analyzer 3 only needs to be run
once, since its operation is not affected by the parameters 8. Since the Locator and the Inserter functions run faster than
the Analyzer, this design allows such iterative insertion steps to be done efficiently and quickly.
[0027] The string of data signals constituting the code to be embedded is preferably a group of hexadecimal digits,
each hexadecimal digit representing four bits, or two insertion symbols. The number of bits in the string must always be
even. If the number is not a multiple of four, the last two bits are separated by a decimal point and expressed as a digit
in the range 0..3. The order of symbol insertion is left to right in the string, and most significant to least significant within
each digit. For instance, the string 75E8.2 corresponds to an 18-bit sequence as shown below in Table 1:

Table 1

Digit        7        5        E        8        2
Bit-pairs    01 11    01 01    11 10    10 00    10


[0028] A single '+' character can be appended to any string up to 64 bits long. This causes 14 extra bits to be internally
calculated and appended to the code. These bits are check digits generated by a (n+7,n,5) BCH code in the GF(2^2)
(quaternary) field, where n is the number of supplied bit-pairs, up to 32. This code can correct up to 2 erroneous
symbols and detect all 3-error cases. The generation of the check code involves effectively zero-padding the supplied bit
string on the left to a full 64 bits, performing the check digit calculation on those bits, and then discarding the padding
bits. In other words, the 14 check digits generated for the string "31E4+" are identical to those generated for
"00000000000031E4+".

[0029] The Inserter receives data from both the Input File 1 and the Insertion File 9 in order to generate Output File 13.
[0030] The following table is a list of the parameters. The parameters are listed along with their default values. The
meaning and utilization of these parameters is explained below.

Table 2

Name                Units      Default       Description
MinEnv              Signal     1200          Control the masking
MaxRatio            None       300           Control the masking
MinRatio            None       0             Control the masking
MaxSideRatio        None       2.5           Control the masking
MinPingSpacing      Seconds    0.1           Control the spacing
ProgDither          Seconds    [illegible]   Control the spacing
PingSpacingAlpha    None       13            Control the spacing
PingSpacingBeta     None       2.4           Control the spacing
BandRemoveFac       None       0.98          Control the scaling and insertion of a ping
PingGain            None       0.4           Control the scaling and insertion of a ping
PingGainMode        None       1             Control the scaling and insertion of a ping



[0031] The encoder confines its operations to a frequency band of approximately 1000 Hz to 5000 Hz because it has
the predominant amount of audio program energy content and is the least deteriorated by sending, playback and
receiving. Thus, it is more robust in terms of resistance to distortion by various effects caused by the processing which
the audio product must undergo, such as compression. Signal components outside this band are not affected by the
encoding process, and are largely ignored by the signal analysis process of the Analyzer 3. This frequency band will be
referred to as the "band of interest".

[0032] When a symbol is inserted, the sound waveform in the original audio product, in the band of interest only, is
removed for a time period allotted for insertion of the symbol, called a symbol insertion period (used interchangeably
herein with "segment"). The removal is not complete in the sense that some of the waveform of the original audio
product is retained throughout this period, and the removal is implemented with a soft ramp at the beginning and end of the
symbol insertion period. The removed audio is replaced by two, or sometimes three, band-limited impulse functions,
spaced at preselected intervals from each other. Data coding is performed by setting the polarity of these impulse
functions. Each impulse function, called a 'ping', is an ideal mathematical impulse function to which a steep bandpass filter
has been applied. The energy of the resulting waveform is confined to, and evenly spread across, the range of 1485 Hz
to 3980 Hz.
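A band-limited impulse of this kind can be approximated by the truncated impulse response of an ideal bandpass filter. The following is a minimal sketch under stated assumptions: the 11025 samples/sec. rate, the truncation to 129 taps and the absence of any tapering window are illustrative choices only, and the function names are hypothetical. The actual filter of the embodiment is the one shown in Figs. 11 and 12.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def band_limited_impulse(f_lo, f_hi, sample_rate, half_length):
    """Unit impulse passed through an ideal bandpass filter, truncated.

    The ideal bandpass response is the difference of two ideal low-pass
    responses with cut-offs f_hi and f_lo (in Hz).
    """
    taps = []
    for k in range(-half_length, half_length + 1):
        hi = 2.0 * f_hi / sample_rate * sinc(2.0 * f_hi * k / sample_rate)
        lo = 2.0 * f_lo / sample_rate * sinc(2.0 * f_lo * k / sample_rate)
        taps.append(hi - lo)
    return taps

# A ping and its opposite-polarity counterpart (data is coded by polarity).
ping = band_limited_impulse(1485.0, 3980.0, 11025.0, 64)
negative_ping = [-v for v in ping]
```

The waveform is symmetric about its central sample, which is also its largest sample; inverting the sign of every sample gives the ping of opposite polarity.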

[0033] Heard alone, these 'ping' symbols sound like a 'click' or 'pop'. The task of the Locator 7 is to find segments of
the audio product in Input File 1 in the band of interest which are sufficiently spectrally rich (as defined below). When a
ping waveform is added to such a segment, it sufficiently resembles the energy levels that were removed so that the
modification is difficult for a person to discern audibly. The magnitude of the inserted symbol is scaled to the
RMS power of the in-band envelope signal in the symbol insertion period (the term "in-band" refers to the frequency
range for the output signal of low pass filter step 20 in Fig. 2). This is done for two reasons: firstly, it helps provide a
consistent 'replacement' power of the removed audio, to reduce the audibility. Secondly, the decoder (as described
below) uses a normalization (or automatic gain) process which scales the signal it receives to provide a uniform RMS
power level in the band of interest. By scaling the ping waveforms to the surrounding RMS, the encoder helps ensure
that the ping waveforms appear in the decoder with a predictable amplitude, namely scaled to the amplitude of the
surrounding audio. The ping symbols are sufficiently structured and unusual that they can be identified by the decoder with
a high degree of accuracy within the surrounding audio, even after allowing for some distortion in the course of
transmission.

[0034] The signal processing generally follows a model wherein a digitized signal is operated on by a signal
processing function, the result being another digitized signal. This signal can then be passed to one or more further processing
blocks. Generally speaking, a "signal" is a stream of 32-bit floating point numbers, or samples, representing the voltage
on a hypothetical physical signal at equally spaced points in time. Due to the nature of the processing being performed,
it is necessary to relate each output point to a specific time point in the Input File 1. In other words, all propagation
delays through the signal paths need to be tracked very carefully. This is complicated by the fact that different signals
have different sampling periods, and different signal processing elements have different delay characteristics. All
sampling periods and delays are expressed in integral multiples of the "fundamental tick", or ftick. The ftick period is about
88.6 nsec. The exact value of this time period depends on the input file sample rate, as explained in detail below.

ANALYZER

[0035] The Analyzer 3 generates two measurements of the input signal, namely the in-band envelope and the Tonality
Ratio. The in-band envelope is a moving-average measure of the RMS power in the in-band signal. The Tonality Ratio
is a figure indicating whether the in-band signal appears to be spectrally rich, i.e. having relatively uniform power
throughout the band, or whether it is tonal, i.e. having a mix of high and low powers. Tonal signals do not mask clicks
and pops well.

[0036] The envelope measurement is done via a series of signal processing stages as described below. The process 
is designed to produce a signal averaging envelope in a way which matches what will be done in the decoder for the 
same audio product, as explained in detail below. 

[0037] As shown in Fig. 2, step 20 converts the signal from its sampling rate of 44100 or 48000 samples/sec. to a
sampling rate close to 11025 samples/sec. All later processing is based on the latter signal. Step 20 also applies a low
pass filter which removes all frequencies above about 4950 Hz so that they have no effect on the encoding process.
[0038] Proceeding to signal path 22 of Fig. 2, the 11025 samples/sec. signal is passed through a bandpass filter 24
with a frequency range of 980 Hz to 4850 Hz. Then, a power measurement process 26 reduces the sampling rate by a
decimating factor of 8. Step 26 takes in eight samples from the bandpass signal outputted by step 24, sums the
squares of the samples, and divides the result by 8. One result is outputted for each eight input samples. An 8 times
decimating and low-pass filter 28 is then applied for smoothing the output of step 26. Filter 28 is a 31 tap filter with the
fixed coefficients set forth in Table 3.

Table 3

OFFSET   VALUE                OFFSET   VALUE
0        0.00364725799282     16       0.0610569733715
1        0.0071953136762      17       0.0596727757011
2        0.0112656333407      18       0.0574157779489
3        0.0157909231222      19       0.0543587323779
4        0.0206811320698      20       0.0505991154649
5        0.0258259243044      21       0.0462550108913
6        0.0310981949734      22       0.0414601700119
7        0.0363584938162      23       0.0363584938162
8        0.0414601700119      24       0.0310981949734
9        0.0462550106913      25       0.0258259243044
10       0.0505991154649      26       0.0206811320698
11       0.0543587323779      27       0.0157909231222
12       0.0574157779489      28       0.0112656333407
13       0.0596727757011      29       0.0071953136762
14       0.0610569733715      30       0.00354725799282
15       0.0615234375







[0039] The filtered result from the sum of the squares outputted by steps 26 and 28 is applied to step 29 which
calculates the square root and inverts the signal. Step 30 re-inverts the signal generated by step 29 to produce the RMS
in-band envelope signal 32. Signal 32 provides two envelope points for each of a plurality of periods of predesigned
duration (e.g. symbol insertion periods), namely the RMS in-band envelope at the beginning and at the centre of such
period.
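The envelope path of steps 26 to 29 can be sketched as follows. This is a simplified illustration, not the embodiment itself: the inversion and re-inversion of steps 29 and 30 are omitted, filter delays are not tracked, only fully overlapped filter outputs are produced, and the uniform placeholder taps stand in for the 31 coefficients of Table 3. All function names are hypothetical.

```python
import math

def block_mean_square(samples, block=8):
    """Step 26 style: one output per `block` inputs, the mean of their squares."""
    out = []
    for start in range(0, len(samples) - block + 1, block):
        chunk = samples[start:start + block]
        out.append(sum(s * s for s in chunk) / block)
    return out

def fir_decimate(signal, taps, factor=8):
    """Apply an FIR filter and keep every `factor`-th output.

    Written as a sliding dot product; for symmetric taps this equals
    convolution. Only positions where all taps overlap the input are used.
    """
    out = []
    for start in range(0, len(signal) - len(taps) + 1, factor):
        out.append(sum(t * signal[start + i] for i, t in enumerate(taps)))
    return out

def rms_envelope(samples, taps):
    """Mean-square power, smoothing and decimation, then square root."""
    power = block_mean_square(samples, 8)
    smoothed = fir_decimate(power, taps, 8)
    return [math.sqrt(max(p, 0.0)) for p in smoothed]

# Placeholder smoothing taps (assumption); the embodiment uses Table 3.
placeholder_taps = [1.0 / 31] * 31
envelope = rms_envelope([2.0] * 320, placeholder_taps)
```

With a constant input of amplitude 2.0 and unity-gain placeholder taps, every envelope point is approximately 2.0. Each envelope point covers 64 input samples, consistent with the two successive decimations by 8.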

[0040] The resulting signal 32 tracks the RMS level of the in-band signal, with a slow response time to reflect the
response of the human auditory system to perceived loudness variations. The Locator 7 will not insert symbols unless
this envelope level is above a certain adjustable threshold. The principle is that quiet material is more likely to be
corrupted by noise in the transmission channel and, therefore, does not provide a safe insertion point for a symbol.
[0041] Proceeding to signal path 34 in Fig. 2, the Tonality Ratio is measured via a Fourier-Transform process 36 which
operates on the signal produced by the step 20. A 1024-sample FFT (Fast Fourier Transform) is used at spacings of
128 samples, on the time domain 11025 samples/sec. data, which provides a frequency resolution of about 11 Hz in
the frequency domain. The resultant FFT signal is divided into 512 frequency bins. However, of these only the 320
central frequency bins are used, and these 320 bins are divided into 16 groups of 20 bins each, per step 38. In step 40, the
total power is calculated within each group. Also, within each group, a measure of the range of power values present in
the group is made. This measure is the ratio of the second-highest bin power divided by the second-lowest. A weighted
sum of these measures is made across all 16 groups, per step 42, using the individual group powers as weights. The
result is the Tonality Ratio.
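Steps 38 to 42 can be sketched as follows, taking the 320 central bins of the 512-bin power spectrum to begin at bin 96. Two points are assumptions of this sketch rather than statements of the specification: the weighted sum is divided by the total group power by default (so that a perfectly flat spectrum gives a value of 1), and a small floor guards against division by zero. The function name and its parameters are hypothetical.

```python
def tonality_ratio(power_bins, first_bin=96, groups=16, bins_per_group=20,
                   normalize=True, floor=1e-12):
    """Power-weighted combination of per-group spread measures.

    power_bins -- magnitude-squared FFT bins (at least first_bin + 320 values)
    normalize  -- divide the weighted sum by the total weight (assumption)
    floor      -- lower bound on the second-lowest bin power (assumption)
    """
    weighted_sum = 0.0
    total_power = 0.0
    for g in range(groups):
        start = first_bin + g * bins_per_group
        group = sorted(power_bins[start:start + bins_per_group])
        group_power = sum(group)                  # step 40: total group power
        spread = group[-2] / max(group[1], floor) # second-highest / second-lowest
        weighted_sum += group_power * spread      # step 42: power-weighted sum
        total_power += group_power
    if normalize and total_power > 0.0:
        return weighted_sum / total_power
    return weighted_sum
```

A spectrally rich (flat) spectrum yields a low value, while a spectrum whose bins alternate between low and high power yields a high value, which matches the behaviour described for the Tonality Ratio.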

[0042] The Tonality Ratio tends to be higher for a tonal signal, i.e. which contains a mix of high and low bin power
values within each group, and lower for spectrally rich signals, i.e. which have relatively uniform bin power across each
group. Tests have shown that this measure tends to correlate very well with the sound's ability to mask the symbols.
Symbols will be inserted in locations where the Tonality Ratio is lower than a preset adjustable limit (i.e., in a spectrally
rich sound waveform). Also, the Tonality Ratios generated by the preceding and following symbol insertion periods from
the FFT analysis operations are compared against a less restrictive limit (i.e., a higher preset limit) than the one used
for the present symbol insertion period. This technique ensures that the symbols are well hidden, or masked, in the
audio product.

[0043] Step 43 determines the total in-band power based on the FFT signal. In particular, as explained below, this
total power is a summation of the in-group powers.

[0044] Thus, for Analyzer 3 as shown in Fig. 2, the input is at 48000 or 44100 samples/sec., and the output is a stream
of records 44, which are stored in the Analysis File 5. Each record contains two envelope points, one Tonality Ratio
value and one value of the total in-band power from the FFT, which are interleaved for a single symbol insertion period.
[0045] Further details regarding the sampling rate will now be provided. The system handles different sample rates
by sample-rate converting the Input File 1 to a common internal sample rate, which is always very close to 11025
samples/sec. The Output File 13 is always at the same rate as the Input File 1, allowing the output to be numerically identical
to the input, except where symbol insertion is being done.

[0046] To simplify the handling of different sample rates, the ftick is defined. This is a time period whose exact value
depends on the audio sample rate. Given f_s, the sample rate of the Input File 1, the value n is found, being the closest
even integer to 11025*1024/f_s.

[0047] The time period 1/(n*f_s) is known as the ftick. This will always be close to 88.6 ns. The ftick is used as the
measure of time within the system. All sampled signals have periods which are multiples of the ftick. For instance, the low
pass filter signal at the output of step 20, to which the input is sample-rate converted, has a period of 1024 fticks. Its
sample rate will approximate 11025 samples/sec. to a few parts per thousand. The table below shows the rate of fticks,
and the sample rate of the low pass filter for input rates of 44100 and 48000 samples/sec., which is the first signal after
down-conversion. Most internal signals have rates equal to this, or decimated by a power of two.
Table 4

Input Sample Rate           44100       48000
n                           256         236
Fundamental Tick Rate       11289600    11328000
Low pass filter (Step 20)   11025       11062.5
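The entries of Table 4 follow directly from the definition of the ftick. The sketch below reproduces them; the function name is hypothetical, and the handling of an exact tie between two even integers is left to Python's built-in rounding, a case the text does not address.

```python
def ftick_parameters(sample_rate):
    """Return (n, ftick rate in Hz, internal sample rate) for an input rate.

    n is the closest even integer to 11025 * 1024 / sample_rate, the ftick
    period is 1 / (n * sample_rate), and the step 20 output has a period
    of 1024 fticks.
    """
    target = 11025 * 1024 / sample_rate
    n = 2 * round(target / 2)          # closest even integer
    ftick_rate = n * sample_rate       # fticks per second
    internal_rate = ftick_rate / 1024  # samples/sec. after step 20
    return n, ftick_rate, internal_rate

for rate in (44100, 48000):
    n, ftick_rate, internal_rate = ftick_parameters(rate)
    print(rate, n, ftick_rate, internal_rate, 1e9 / ftick_rate)  # last: ns
```

For both input rates the ftick period evaluates to between 88 and 89 ns, in line with the figure of about 88.6 ns given above.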



[0048] The Analyzer has seven data members which are intermediate processing results. These members are
described below.



Table 5

Member         Sample Period         Description
               (Fundamental ticks)
Input File 1   236 or 256            Input signal (48000 or 44100 samples/sec)
Step 20        1024                  Low-pass and sample-rate converted signal
Step 24        1024                  Signal after band-pass filter
Step 26        8192                  Signal after power measurement
Steps 28, 29   65536                 Signal after filtering of power measurement and inversion
Step 30        65536                 Signal after final filter and re-inversion
Step 36        65536                 Signal generated by FFT analysis



[0049] Details of the steps in signal path 34 of Fig. 2 will now be provided. 

[0050] To run the FFT operation of step 36, an FFT sample size of 1024 is read from the output of step 20 and a cosine (von Hann) window function is applied to the time domain data. An input pointer is moved by 128 samples after each FFT operation, so there is a substantial overlap between successive FFT operations. Each FFT operation performs the following function:

1. M real numbers are presented as a waveform to be transformed. M (i.e. 1024) is the FFT size used.

2. The time-domain window is applied to the waveform, the FFT is performed, and then the power for each of the frequency bins from 0 to (M/2)-1 is calculated as the square of the magnitude of the complex FFT result. The resulting complex spectral image is thus simplified by retaining the square of the magnitude of each frequency bin and discarding the phase information.

3. M/2 real numbers are then retrieved. For M = 1024, these are the power values for frequency bins 0 to 511.

[0051] This FFT operation can be described mathematically as

$$F_f = \left| \sum_{t=0}^{M-1} x_t \, w(t) \, e^{-2\pi i f t / M} \right|^2, \qquad f = 0, \ldots, \frac{M}{2} - 1$$

... where $x_t$ is the input waveform, M is the FFT size, w(t) is the window function, and $F_f$ is the magnitude-squared result. No scaling is performed so, for instance, the output in bin zero is the square of the sum of the M inputs (after windowing).
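A minimal sketch of the windowed power spectrum described above, using a direct DFT so that it stays self-contained. The exact window formula is not given in the text; the periodic Hann form used in `hann_window` is an assumption, and both function names are hypothetical.

```python
import cmath
import math


def hann_window(M):
    # Assumed window shape: periodic Hann, w(t) = 0.5 - 0.5*cos(2*pi*t/M).
    return [0.5 - 0.5 * math.cos(2 * math.pi * t / M) for t in range(M)]


def power_spectrum(x):
    """Unscaled magnitude-squared spectrum for bins 0 .. M/2 - 1."""
    M = len(x)
    w = hann_window(M)
    xw = [x[t] * w[t] for t in range(M)]
    spectrum = []
    for f in range(M // 2):
        acc = sum(xw[t] * cmath.exp(-2j * math.pi * f * t / M) for t in range(M))
        spectrum.append(abs(acc) ** 2)   # keep the power, discard the phase
    return spectrum


# Constant input of 16 samples: bin zero is the square of the windowed sum.
spec = power_spectrum([1.0] * 16)
print(len(spec), spec[0])
```

In the system described, M would be 1024 and successive frames would start 128 samples apart; a real implementation would use an FFT rather than this O(M^2) loop.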

[0052] Each FFT result is, thus, obtained in the form of a power spectrum: a set of frequency bins, numbered from 0 to 511, containing the square of the magnitude of the complex FFT result for the corresponding frequency. The n-th bin contains information for a frequency of

$$n \cdot \frac{11025}{1024}\ \text{Hz} \approx n \cdot (10.77\ \text{Hz})$$



[0053] In accordance with step 38, the subset of this power spectrum corresponding to the band of interest is divided into groups. There are 16 groups, each having 20 frequency bins. The lowest group starts at frequency bin 96. Bins 0-95, as well as 416-511, are discarded. So these 16 groups cover the frequency range from about 1034 Hz to about 4470 Hz, each group covering about 215 Hz.

[0054] Within each group, the following three determinations are made in accordance with step 40:

• The sum of the power values in each of the 20 bins (total group power)

• The second smallest power value in the 20 bins

• The second largest power value in the 20 bins.

[0055] Of course, the second smallest and second largest power values could be replaced by values of other bins, the main object being to discard the highest and lowest values as possible statistical aberrations. For each group, the second largest bin power is divided by the second smallest, and then the following non-linear clipping operation is performed: if the result exceeds 2000, then half the amount by which it exceeds 2000 is deducted. If the result of this operation exceeds 4000, then it is set to 4000. This provides a unitless measure of the power skew in the group. If the second smallest bin power is very small (as defined below in the equation for $T_i$), the power skew for the group is simply set to 4000 to avoid division by a very small number.

[0056] A weighted average is taken of the square roots of these ratios. The weighting value for each group is the total group power. The results provided by step 40 are squared and become the Tonality Ratio result for the whole FFT operation, as per step 42. It will always be at least one and no more than 4000. If $P_i$ is the total power obtained by summing the power of the bins in group i, and $B_i$ and $b_i$ are the second-largest and second-smallest bin powers, the calculation can be summarized as

$$T_i = \begin{cases} \dfrac{B_i}{b_i} & \text{if } \dfrac{B_i}{b_i} \le 2000 \\[2ex] \dfrac{B_i}{2 b_i} + 1000 & \text{if } \dfrac{B_i}{b_i} > 2000 \text{ and } \le 6000 \\[2ex] 4000 & \text{otherwise} \end{cases}$$

$$T = \left( \frac{\sum_i P_i \sqrt{T_i}}{\sum_i P_i} \right)^2$$

... where T is the overall Tonality Ratio.
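The grouping, clipping and weighted averaging of steps 38 to 42 can be sketched as follows. The function names are hypothetical, and the test `B > 6000 * b` is one way (an assumption) to apply the "very small" rule without ever dividing by a tiny value; it is equivalent to the "otherwise" branch of the clipping rule.

```python
import math

FIRST_BIN = 96        # lowest bin of the lowest group
GROUPS = 16
BINS_PER_GROUP = 20


def group_skew(bins):
    """Clipped ratio of second-largest to second-smallest bin power."""
    ordered = sorted(bins)
    b = ordered[1]       # second smallest
    B = ordered[-2]      # second largest
    if b <= 0 or B > 6000 * b:
        return 4000.0    # avoids division by a very small number
    ratio = B / b
    if ratio <= 2000:
        return ratio
    return ratio / 2 + 1000   # deduct half of the excess over 2000


def tonality_ratio(power):
    """Overall Tonality Ratio T from a 512-bin power spectrum."""
    num = 0.0
    den = 0.0
    for g in range(GROUPS):
        start = FIRST_BIN + g * BINS_PER_GROUP
        bins = power[start:start + BINS_PER_GROUP]
        total = sum(bins)                       # total group power (weight)
        num += total * math.sqrt(group_skew(bins))
        den += total
    if den == 0:
        raise ValueError("no in-band power in this frame")
    return (num / den) ** 2


print(tonality_ratio([1.0] * 512))   # a perfectly flat spectrum
```

A flat spectrum gives the minimum value of 1, while a spectrum dominated by a few strong bins in each group pushes the result towards the 4000 ceiling.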

[0057] There is also an RMS power calculation for the total in-band power which is derived from the FFT signal. The calculation of such total in-band power is as below:

[0058] The division by 1024 corrects for power gain in the unscaled FFT operation, and the square root converts from a power measure to an RMS measure. A further factor corrects for attenuation due to the cosine window. If the input is a sine wave of amplitude A, with a frequency in the band of interest, the resulting P' will be close to

$$\frac{A}{\sqrt{2}}$$

which is the correct RMS measure for a sine wave.

[0059] A data record 44 corresponds to information about the sound waveform of the audio product for a symbol insertion period, or about 11.6 msec of real-time for a 44,100 samples/sec rate.

[0060] There are four data values in the record:

1. The in-band envelope level (from step 32) at the beginning of the 11.6 msec period. This is in signal units and is approximately equal to the R.M.S. value of the in-band signal. It is rounded and stored as an unsigned 16-bit integer.

2. The in-band envelope level (from step 32) at the centre of the 11.6 msec period. The same units and scaling are used as for the first envelope point.

3. The Tonality Ratio (from step 42) for an FFT calculation centered about the beginning of the 11.6 msec period. This is a unitless number in the range 1-4000; it is scaled by 2 s and stored as a 16-bit unsigned number.

4. The total in-band power (from step 43) based on the FFT signal for a time period centered about the beginning of the 11.6 msec period. This is in signal units and is approximately equal to the R.M.S. value of the in-band signal. It is rounded and stored as an unsigned 16-bit integer.

Mathematical Background

[0061] The following description supplies the mathematical background underlying some processes used in the Analyzer 3. General methods are given for

• Sample rate conversion, by a rational ratio less than unity;

• Construction of a low-pass, linear phase FIR filter from arbitrary parameters;

• Construction of a band-pass, linear phase FIR filter from arbitrary parameters.

[0062] A down-sampling process of step 20 converts the Input File signal to the low pass filter signal, simultaneously reducing the sample rate by a rational factor and removing higher frequencies which cannot be represented in the lower sample-rate signal. The Input File 1 is resampled in such a way as to reduce the sample rate by a factor of K/V. K is the factor by which the original sample rate is multiplied. V is the decimation factor. It is assumed that K<V, but the same approach works for increasing the sample rate. The ratio must be in lowest terms; i.e. K and V have no divisors in common.

[0063] Conceptually, the signal is processed through the following individual steps:

1. The input signal is called $x_i$.

2. The rate of $x_i$ is increased by a factor of K, giving the signal $x'_i$. This is done by inserting K-1 zero samples after each sample:

$$x'_i = \begin{cases} x_{i/K} & \text{if } i \text{ is a multiple of } K \\ 0 & \text{otherwise} \end{cases}$$

3. A low-pass FIR (Finite Impulse Response) filter with impulse response $F_k$ of length L·K is applied to this signal, giving the signal $y'_i$:

$$y'_i = \sum_{k=0}^{LK-1} F_k \, x'_{i-k}$$

4. The result is decimated by discarding all samples except those whose indices are a multiple of V, leaving the signal $y_i$:

$$y_i = y'_{iV}$$

[0064] These steps produce a rate-converted signal which is as close as possible to the mathematical ideal. The filter used in step 3 is ideally a "brick-wall" low-pass which removes all frequencies too high to be represented at the new sample rate. Since an ideal filter has an infinite impulse response, the FIR can only approximate this ideal. A certain amount of signal aliasing will occur in step 4, which also arises because of the non-ideal FIR filter. Step 2 reduces the amplitude of the signal in the band of interest. This is compensated in step 3 by arranging the filter F to have a passband gain numerically equal to K.

[0065] The process described just above can be simplified by capitalizing on the fact that many of the upsampled samples (the output of step 2) are zero and many of the filtered samples (the output of step 3) are discarded.

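[0066] The FIR filter $F_k$ used in step 3 can be rearranged into matrix form:

$$G_{k,j} = F_{k + Kj}, \qquad j = 0, \ldots, L-1, \qquad k = 0, \ldots, K-1$$

so that, writing $x'_i$ for the zero-stuffed signal of step 2 and $y'_i$ for the filtered signal of step 3,

$$y'_i = \sum_{k=0}^{K-1} \sum_{j=0}^{L-1} G_{k,j} \, x'_{i-k-Kj}$$

$x'_i$ is zero except when i is a multiple of K, i.e. when (i-k-Kj) MOD K = 0 in the equation above. An equivalent condition is k = i MOD K. Thus, the outer summation can be eliminated, using a single value for the index variable k:

$$y'_i = \sum_{j=0}^{L-1} G_{k_0,j} \, x'_{i-k_0-Kj}$$

where $k_0 = i \bmod K$.

[0067] With this definition of $k_0$ we have

$$i - k_0 = i - (i \bmod K) = K \left\lfloor \frac{i}{K} \right\rfloor$$

So,

$$y'_i = \sum_{j=0}^{L-1} G_{k_0,j} \, x'_{K(\lfloor i/K \rfloor - j)}$$

where $k_0 = i \bmod K$. From the definition of $x'_i$,

$$y'_i = \sum_{j=0}^{L-1} G_{k_0,j} \, x_{\lfloor i/K \rfloor - j}$$

and, from the definition of $y_i$,

$$y_i = y'_{iV} = \sum_{j=0}^{L-1} G_{(iV \bmod K),\, j} \; x_{\lfloor iV/K \rfloor - j}$$

[0068] This last equation performs the whole conversion in a single step. By substituting

$$q_i = \left\lfloor \frac{iV}{K} \right\rfloor$$

and

$$k_i = iV \bmod K$$

or,

$$iV = K q_i + k_i$$

the last equation can be rewritten as

$$y_i = \sum_{j=0}^{L-1} G_{k_i,j} \, x_{q_i - j}$$

[0069] Each summation operation is equivalent to applying a size L FIR filter to the input signal x starting at offset $q_i$. The coefficients of the FIR (its impulse response) are the row of G selected by $k_i$. Since the overall filter F has a gain of K, each row of G has a passband gain approximately equal to unity.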
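The equivalence between the four conceptual steps and the single-step form can be checked numerically. This is a sketch under stated assumptions: samples before the start of the input are treated as zero, the filter length is exactly L·K, and the function names are hypothetical.

```python
import random


def resample_naive(x, F, K, V):
    """Zero-stuff by K, filter with F, then keep every V-th sample."""
    up = [0] * (len(x) * K)
    for i, sample in enumerate(x):
        up[i * K] = sample
    filtered = []
    for i in range(len(up)):
        acc = 0
        for k, coeff in enumerate(F):
            if i - k >= 0:
                acc += coeff * up[i - k]
        filtered.append(acc)
    return filtered[::V]


def resample_polyphase(x, F, K, V):
    """Single-step form: y_i = sum_j G[k_i][j] * x[q_i - j]."""
    L = len(F) // K
    G = [[F[k + K * j] for j in range(L)] for k in range(K)]
    n_out = (len(x) * K + V - 1) // V
    y = []
    for i in range(n_out):
        q, k = divmod(i * V, K)      # q_i = floor(iV/K), k_i = iV mod K
        acc = 0
        for j in range(L):
            if q - j >= 0:
                acc += G[k][j] * x[q - j]
        y.append(acc)
    return y


random.seed(1)
K, V, L = 3, 7, 4
F = [random.randint(-5, 5) for _ in range(L * K)]
x = [random.randint(-100, 100) for _ in range(50)]
print(resample_naive(x, F, K, V) == resample_polyphase(x, F, K, V))
```

Integer test data is used so that the two results can be compared exactly. The single-step form does L multiplications per output sample instead of L·K multiplications per upsampled sample.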

[0070] Assuming that the FIR is a symmetrical, zero phase low pass design, the delay incurred between the zero-stuffed signal of step 2 and the filtered signal of step 3 is (LK-1)/2 samples. The other conversions do not introduce delay, but they change the sample rates. The overall delay is

$$\frac{LK-1}{2K}$$

input samples, or

$$\frac{LK-1}{2V}$$

output samples.

[0071] The rate conversion done on the input sample rate is a reduction by a factor of n/1024, where n is the number of fticks in the period of the input audio file. Thus, the conversion technique can be applied with

$$K = \frac{n}{C}, \qquad V = \frac{1024}{C}$$

where C is the greatest common factor of 1024 and n. The delay, expressed in output samples, is thus

$$\frac{\dfrac{nL}{C} - 1}{2 \cdot \dfrac{1024}{C}}$$

or

$$\frac{nL - C}{2 \cdot 1024}$$

[0072] In terms of fticks, the delay of the conversion process is (nL - C)/2 fticks. Since n is even, C will also be even, so this will be a whole number.
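For the two input rates of Table 4, the conversion factors and the delay can be worked out with a few lines. The function names are hypothetical, and the filter length L = 32 used in the example is an arbitrary illustrative value, not one taken from the text.

```python
import math


def conversion_factors(n):
    """Return (K, V, C) for an input period of n fticks."""
    C = math.gcd(n, 1024)
    return n // C, 1024 // C, C


def delay_in_fticks(n, L):
    """Delay of the rate conversion, (nL - C)/2 fticks."""
    C = math.gcd(n, 1024)
    return (n * L - C) // 2   # exact: n and C are both even


for n in (256, 236):
    print(n, conversion_factors(n), delay_in_fticks(n, 32))
```

With n = 256 (44100 samples/sec) the conversion reduces to plain decimation by 4, while n = 236 (48000 samples/sec) needs the rational ratio 59/256.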

[0073] A generalized low-pass FIR filter with linear phase delay can be constructed using the general equation

$$F_i = \frac{2 f_c}{f_s} \, \mathrm{sinc}\!\left( \frac{2 f_c}{f_s} (i - i_0) \right) w(i, L)$$

where

$$i_0 = \frac{L-1}{2}, \qquad \mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$$

L is the length of the filter, $f_s$ is the sampling frequency, $f_c$ is the cutoff frequency, and w is a window function which applies over the range [0, L - 1]. The Kaiser-Bessel window function, that is,

$$w(i, L) = \frac{I_0\!\left( \beta \sqrt{1 - \left( \dfrac{2i - (L-1)}{L-1} \right)^2} \right)}{I_0(\beta)}$$

is useful because the beta value can be adjusted to control the trade off between sharp cutoff and good out-of-band rejection for a given filter length. $I_0(x)$ is a modified Bessel function.

[0074] All frequencies are delayed by $\frac{1}{2}(L - 1)$ samples by such a filter.

[0075] It should be noted that sinc(x) needs to be evaluated as 1, or, more accurately, as $1 - \frac{1}{6}(\pi x)^2$, for values of x close to zero.

[0076] By multiplying the low-pass filter with the centre frequency of the passband, a generalized band pass FIR filter can be realized as

$$F_i = 2 \cos\!\left( 2\pi \frac{f_c}{f_s} (i - i_0) \right) \frac{f_w}{f_s} \, \mathrm{sinc}\!\left( \frac{f_w}{f_s} (i - i_0) \right) w(i, L)$$

where

$$i_0 = \frac{L-1}{2}, \qquad \mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$$

as before; L is the length of the filter, $f_s$ is the sampling frequency, $f_w$ is the width of the pass band, $f_c$ is the centre frequency of the pass band, and w is a window function.
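A windowed-sinc design along these lines can be sketched with the standard library alone. The gain factors 2·fc/fs and 2·cos(...) are assumptions chosen to give unity passband gain, the series used for the Bessel function is the usual power series, and all function names and the example parameters (length 101, rate 11025) are hypothetical.

```python
import cmath
import math


def bessel_i0(x, terms=40):
    """Modified Bessel function I0 via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2) / k
        total += term * term
    return total


def sinc(x):
    if abs(x) < 1e-6:
        return 1.0 - (math.pi * x) ** 2 / 6.0   # small-x evaluation
    return math.sin(math.pi * x) / (math.pi * x)


def kaiser(i, L, beta):
    r = (2 * i - (L - 1)) / (L - 1)
    return bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)


def lowpass_fir(L, fs, fc, beta):
    i0 = (L - 1) / 2
    c = 2 * fc / fs
    return [c * sinc(c * (i - i0)) * kaiser(i, L, beta) for i in range(L)]


def bandpass_fir(L, fs, fc, fw, beta):
    """Low-pass of cutoff fw/2, shifted to the centre frequency fc."""
    i0 = (L - 1) / 2
    low = lowpass_fir(L, fs, fw / 2, beta)
    return [2 * math.cos(2 * math.pi * fc * (i - i0) / fs) * low[i] for i in range(L)]


def gain(h, fs, f):
    """Magnitude response of filter h at frequency f."""
    return abs(sum(c * cmath.exp(-2j * math.pi * f * i / fs) for i, c in enumerate(h)))


bp = bandpass_fir(101, 11025, 2500, 3000, 7.0)
print(gain(bp, 11025, 2500), gain(bp, 11025, 0))
```

The printed values show the intended behaviour: close to unity gain at the centre of the pass band and strong rejection at DC.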



LOCATOR

[0077] As shown in Fig. 3, the Locator analyses records 44 from the Analysis File 5. As stated above, each record 44 contains information for about 11.6 msec of the audio product. The Locator reads every one of these records in sequence. As each record is read, a decision is made as to whether to insert a symbol at that point. The Locator is divided into three separate processes:

• The Spacing process guarantees a minimum spacing between consecutive symbols, and prevents symbols from being inserted at regular intervals. This process does not examine the Input File characteristics.

• The Masking Test process examines the signal characteristics at each possible insertion point, as determined by the spacing process, to determine whether a symbol inserted at that point would be masked well. This process does not consider the spacing between insertions.

• The Code Sequencing process is responsible for sequencing the binary codes represented by the symbols. When a decision is made to insert a symbol, the Code Sequencing process determines the type of symbol to be inserted, as explained below, and the binary data it is to carry.

[0078] The spacing determination is made first to identify the minimum spacing for locating the next symbol. The masking test is not performed until a record 44 is reached for which this minimum spacing requirement is met.

Spacing Process

[0079] The ping spacing process of step 60 (Fig. 3) is controlled by the following four parameters:

• MinPingSpacing (also referred to as "MPS"), expressed in seconds;

• PingSpacingAlpha and PingSpacingBeta, unitless ratios; and

• PingDither, expressed in seconds.

[0080] MinPingSpacing is the minimum time between symbol insertions. The other parameters combine to increase the minimum spacing in various ways. The earliest time the i-th symbol can be inserted is given by

$$t_i \ge \max \begin{pmatrix} t_{i-1} + \text{MinPingSpacing} \\ t_{i-2} + \text{MinPingSpacing} \cdot (1 + \alpha) \\ t_{i-3} + \text{MinPingSpacing} \cdot (1 + \alpha + \beta) \end{pmatrix} + \text{PingDither} \cdot r_i$$

where $t_i$ is the insertion time of the i-th symbol,

α = MinPingSpacingAlpha

β = MinPingSpacingBeta

$r_i$ is a random value uniformly distributed in [0,1].

Values of $t_i$ where i<0 are assumed to be zero. The $r_i$ random variable is generated each time a new symbol is inserted, and its effect is to ensure that the interval between symbols is random.

[0081] Typical audio products contain a continuous sequence of segments which are suitable for symbol insertion. This Spacing Process prevents rapid, regularly spaced insertion of symbols, since such are detectable to the ear because of their regularity. Selection of α and β is made empirically by listening to their effect on audibility.

[0082] The following table has been derived for illustrative purposes only. The arbitrarily chosen parameter values are: MPS=3, α=2 and β=1. For the sake of simplicity, $r_i$ is not taken into account in this example. A dash marks a cell that has no entry.

Table 6

Symbol   t(i-1)+MPS   t(i-2)+MPS(1+α)   t(i-3)+MPS(1+α+β)   Min t(i) based on Spacing Test   Actual t(i) based on Masking Test
1        -            -                 -                   -                                0
2        0+3=3        -                 -                   3                                3
3        3+3=6        0+3(3)=9          -                   9                                12
4        12+3=15      3+3(3)=12         0+3(4)=12           15                               16
5        16+3=19      12+3(3)=21        3+3(4)=15           21                               23
6        23+3=26      16+3(3)=25        12+3(4)=24          26                               26
7        26+3=29      23+3(3)=32        16+3(4)=28          32                               32
8        32+3=35      26+3(3)=35        23+3(4)=35          35                               35
9        35+3=38      32+3(3)=41        26+3(4)=38          41                               46
10       46+3=49      35+3(3)=44        32+3(4)=44          49                               50
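The "Min" column of Table 6 can be reproduced from the "Actual" column with a short sketch. The function name and its signature are hypothetical. Following the table, terms that refer to symbols not yet inserted are simply skipped, which is an assumption about how the start of a sequence is handled.

```python
import random


def earliest_insertion_time(previous, mps, alpha, beta, dither=0.0, rng=random):
    """Earliest time for the next symbol, given earlier insertion times.

    previous: insertion times of the symbols inserted so far, oldest first.
    """
    factors = (1.0, 1.0 + alpha, 1.0 + alpha + beta)
    # Pair the most recent time with MPS, the one before with MPS(1+alpha), etc.
    candidates = [t + mps * f for t, f in zip(reversed(previous), factors)]
    base = max(candidates) if candidates else 0.0
    return base + dither * rng.random()   # random dither term


actual = [0, 3, 12, 16, 23, 26, 32, 35, 46, 50]   # "Actual" column of Table 6
mins = [earliest_insertion_time(actual[:i], mps=3, alpha=2, beta=1)
        for i in range(1, len(actual))]
print(mins)
```

With the dither left at zero, as in the table, the list printed for symbols 2 to 10 matches the "Min" column; a non-zero dither adds a random amount between zero and PingDither to each value.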



[0083] Often used values for these parameters are: MPS = 0.1 secs, α = 1.3 secs, β = 2.4 secs.

[0084] If step 62 indicates that a particular record meets the minimum spacing requirement, then the masking test is performed on that record and all subsequent records until a symbol is inserted. Then the ping spacing process of step 60 is performed again in order for a new minimum spacing to be calculated and applied.



Masking Test Process 

[0085] The Masking Test process 64 compares the signal characteristics for records obtained by step 62 from the Analysis File 5 against limits set by the encoder parameters 8. This process is controlled by the following parameters:

• MinRatio and MaxRatio control the acceptable limits of the Tonality Ratio.

• MaxSideRatio sets the acceptable maximum for the Tonality Ratio in adjacent measurements. The Locator has one frame of "look-ahead", so it must read the next record before deciding whether to insert a symbol at the time corresponding to the present record.

• MinEnv sets the minimum acceptable value of the signal envelope measurement.

[0086] MinEnv is in signal units; the others are unitless.

[0087] If the symbol $v_i$ represents the Tonality Ratio as retrieved by step 64 from the i-th record in the Analysis File 5, and $e_i$ is the in-band envelope estimate for the same point in time, then a symbol can be inserted at the corresponding time point if

$$v_i \ge \text{MinRatio}$$
$$v_i \le \text{MaxRatio}$$
$$v_{i-1} \le \text{MaxRatio} \cdot \text{MaxSideRatio}$$
$$v_{i+1} \le \text{MaxRatio} \cdot \text{MaxSideRatio}$$
$$e_i \ge \text{MinEnv}$$

are all true.

"MinEnv" should be set with care; it needs to be adapted to the signal input level in use. If the signal input level is reduced by half, for instance (down 6 dB), the encoder will perform fewer insertions. Dividing MinEnv in half will restore the original operation. The tonality measurements are not affected by the input level.

[0088] As discussed above, each record 44 has two in-band envelope levels. Both levels must meet the above-stated threshold for $e_i$.

[0089] The purpose of MinEnv is to prevent the encoder from inserting symbols into portions of the audio product which are so quiet that they may be overwhelmed by noise in the transmission channel.

[0090] The other conditions ensure that the proposed insertion site, and the surrounding area of the signal, are sufficiently non-tonal or have sufficient psychoacoustics to mask the symbol well.

[0091] MaxSideRatio ("MSR") is expressed as a ratio relative to MaxRatio ("MR"). Normally MSR/MR is >1.

[0092] The closer the Tonality Ratio approaches 1, the more spectrally rich (in the sense of having substantially uniform power in the band of interest) is the audio product. MinRatio establishes the noise floor for symbol insertion. If a segment is extremely noisy and is then subjected to noise reduction or removal systems, symbols inserted into such a noisy segment could be removed and lost through the same process. By defining the MinRatio for Tonality, it can be ensured that symbols are acted upon as "real" audio and therefore enhance recovery by the decoder.

[0093] Additional, more extensive, well known masking parameters can be added which use the signal characteristics data obtained from the Analysis File 5 to define temporal masking values of the bandpass signal. This more elaborate type of masking test employs more precise psychoacoustic masking effects of the signal immediately around the symbol insertion time segment.
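The basic masking test can be sketched as a single predicate. This is one reading of the conditions, assuming the side limit applies to both the previous and the next record and that both envelope points of the present record must reach MinEnv; the function name and the numeric values in the example are hypothetical.

```python
def can_insert(prev_ratio, ratio, next_ratio, env_start, env_centre,
               min_ratio, max_ratio, max_side_ratio, min_env):
    """Return True if a symbol may be inserted at the present record."""
    side_limit = max_ratio * max_side_ratio   # limit for adjacent records
    return (min_ratio <= ratio <= max_ratio
            and prev_ratio <= side_limit
            and next_ratio <= side_limit
            and env_start >= min_env
            and env_centre >= min_env)


params = dict(min_ratio=2, max_ratio=8, max_side_ratio=1.5, min_env=100)
print(can_insert(5, 4, 6, 200, 210, **params))    # all conditions met
print(can_insert(5, 4, 13, 200, 210, **params))   # next record too tonal
```

Because the next record's Tonality Ratio is an input, the decision for a record can only be made once the following record has been read, which is the one frame of look-ahead mentioned above.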

[0094] If the masking test is met, as determined by step 66, then step 68 of Fig. 3 is responsible for determining the RMS level of the "ping" waveforms required for symbol insertion. This level is calculated based on the parameters PingGain and PingGainMode from among parameters 8:

• If PingGainMode is 1, the insertion level is PingGain times the in-band envelope level at the insertion time (i.e. output of step 32).

• If PingGainMode is 2, the insertion level is PingGain times the total in-band power as estimated by the Fourier Transform at the insertion time (i.e. output of step 43).

[0095] The total in-band power as estimated by the FFT at the time of insertion with step 43 is a more accurate measure of the power of the audio signal than the more averaged envelope level outputted by step 32. This higher degree of accuracy is required as more precise psychoacoustic masking decisions are used. This allows for careful control of the amplitude of inserted symbols so that codes are less likely to be lost when psychoacoustic compression systems are used in the audio delivery chains.

[0096] The value of PingGain is set empirically based on tests of program types and signal characteristics.

Code Sequencing 

[0097] The encoder can insert any binary code, any even number of bits in length. Each symbol carries two bits, except for the start and end symbols.

[0098] The codeword to be encoded, 2n bits in length, is broken up into n 2-bit sections, referred to as "dits" (for double-bits). The codeword is then encoded using

• A start symbol, which serves to mark the start of the codeword, but carries no information. This also establishes a reference in the receiver, indicating whether the channel has inverted the signal.

• n-1 data symbols, which encode 2 bits each;

• An end symbol, which encodes the last 2 bits of the codeword and also serves to mark the end of the codeword with the addition of a conditional ping.

[0099] The start and end symbols are each coded as 3-ping symbols, while the data symbol is a 2-ping symbol. The start symbol is coded by the bit pattern '000', meaning three negative pings. The two pings in a data symbol are coded according to the two bits being encoded. In an end symbol, the first two pings are encoded according to the data, and the third ping is the inverse of the second ping of the symbol.

[0100] The table below shows an example where the 12-bit sequence '01 10 11 00 11 10' is encoded with seven symbols:

Table 7

Bits    Coded    Ping Polarities    Symbol Type
        000      - - -              Start
01      01       - +                Data
10      10       + -                Data
11      11       + +                Data
00      00       - -                Data
11      11       + +                Data
10      101      + - +              End
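The sequencing rules above can be expressed as a small sketch that reproduces Table 7. The function names and the tuple representation of a symbol are hypothetical; only the coding rules themselves come from the text.

```python
def encode_codeword(bits):
    """Split a bit string into start, data and end symbols."""
    if len(bits) == 0 or len(bits) % 2 or set(bits) - {"0", "1"}:
        raise ValueError("codeword must be a non-empty, even-length bit string")
    dits = [bits[i:i + 2] for i in range(0, len(bits), 2)]
    symbols = [("Start", "000")]                      # three negative pings
    for dit in dits[:-1]:
        symbols.append(("Data", dit))                 # two pings per data symbol
    last = dits[-1]
    conditional = "1" if last[1] == "0" else "0"      # inverse of the second ping
    symbols.append(("End", last + conditional))
    return symbols


def polarities(coded):
    """Render coded bits as ping polarities: '0' is negative, '1' is positive."""
    return " ".join("+" if bit == "1" else "-" for bit in coded)


for kind, coded in encode_codeword("011011001110"):
    print(kind, coded, polarities(coded))
```

For the 12-bit example this yields one start symbol, five data symbols and one end symbol, seven symbols in all.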



[0101] Thus, in step 70 of Fig. 3, the data string of code sequencing determines the present position reached in the code to be embedded during the embedding operation of the encoder, as shown above in Table 1. If the operation is at the beginning of such code, then step 70 will provide the start symbol for insertion. Likewise, if a data bit is to be encoded, as per the table presented above, the corresponding code symbol type and code will be provided. Step 71 stores into Insertion File 9 the symbol type and data code received from step 70 as well as the ping level from step 68. This is stored in association with a particular segment of, or location in, the sound waveform of the audio product.

[0102] When specified conditions for symbol insertion have been met and a symbol is to be inserted, step 72 causes the spacing process and code sequencing to be updated.

INSERTER

[0103] The Inserter 11 reads the Input File 1 and writes a new WAVE audio file which is identical to the Input File 1, except for short sections which are modified by symbol insertions made according to the information in the Insertion File 9.

[0104] The Insertion File 9 provides the following information for each symbol:

• Insertion time (given in fticks)

• Insertion Gain (as an R.M.S. level)

• Symbol type, and data bits to be coded.

[0105] The Output File 13 must always have the same sample rate and format as the Input File 1; e.g. if the Input File is 16-bit stereo at 48000 samples/sec, then the Output File has the same characteristics.

[0106] A symbol is inserted into the original audio product by the two processes, Band Removal and Ping Addition, described in detail below. The nominal insertion time is the midpoint of the area in which the insertion is performed.

Band Removal 

[0107] As shown in Fig. 4, a bandpass signal is derived by applying a bandpass FIR filter to the original signal from Input File 1 to remove frequencies outside the range 1000-4800 Hz approximately. This is done per step 80 by using a symmetrical, effectively non-causal filter with zero phase response. This filter requires (N-1)/2 samples before and after the modified section of the signal, where N is the filter length, which is odd.

[0108] In step 84, a modified bandpass signal derived by step 82 (as explained below) is subtracted from the original signal over a period of 7T or 10T. Each T is 64 samples at 44100 samples/sec, or about 1.451 msec. As shown in Figs. 5 and 6, a 2-ping symbol occupies 5T and a 3-ping symbol occupies 8T. The duration of signal represented by a record 44 is 8T, or 11.6 msec.

[0109] In step 82, first and last sections of the bandpass signal, of length T each, are tapered with a raised cosine window. The windowing operation ensures that no transients are introduced into the signal. Thus, as shown in Figs. 5 and 6, 7T is used for the 2-ping symbol (i.e. for data type) and 10T is used for the 3-ping symbol (i.e. for start type and end type). Also, the bandpass signal is multiplied by BandRemoveFac, an empirically set parameter that is one of parameters 8, which is just less than 1. This causes the in-band signal to be almost entirely removed in the symbol insertion segment. The retention of some contextual component of the audio product has been empirically determined to have a psychoacoustic advantage in masking the removal of the audio and the insertion of the symbol. The retained components should be in the range of 2% - 5% of the amplitude of the audio product in the symbol insertion segment.
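Steps 82 and 84 can be sketched as a per-sample gain applied to the bandpass signal before it is subtracted from the original. The half-cosine ramp used here is an assumed form of the raised cosine taper, the function names are hypothetical, and the values 64 samples per T and 0.97 in the example are illustrative only.

```python
import math


def band_removal_gain(num_T, samples_per_T, band_remove_fac):
    """Gain envelope over a 7T or 10T segment: taper, flat section, taper."""
    ramp = [0.5 * (1.0 - math.cos(math.pi * k / samples_per_T))
            for k in range(samples_per_T)]           # fades from 0 towards 1
    flat = [1.0] * ((num_T - 2) * samples_per_T)
    return [band_remove_fac * g for g in ramp + flat + ramp[::-1]]


def remove_band(original, bandpass, gain):
    """Subtract the tapered, scaled bandpass signal from the original."""
    return [o - g * b for o, b, g in zip(original, bandpass, gain)]


gain = band_removal_gain(num_T=7, samples_per_T=64, band_remove_fac=0.97)
print(len(gain), gain[0], gain[len(gain) // 2])
```

With a factor of 0.97, about 3% of the in-band amplitude is retained in the middle of the segment, within the 2% to 5% range mentioned above, and the gain starts and ends at zero so that no step is introduced at the segment boundaries.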



Ping Addition

[0110] Once the band-removal is completed, a fixed "ping" waveform is scaled per step 86, the symbol is derived per step 88 and inserted per step 90 in the signal. Two or three pings (depending on the symbol type) are spaced 3T apart from each other and centered in the segment from which the band was removed.

[0111] Scaling of the ping in step 86 is based on two factors:

1. Polarity: '0' bits have negative polarity, while '1' bits have positive polarity. This information is obtained from Insertion File 9. The start symbol establishes a reference in case the process of transmission and reception inverts the signal.

2. Magnitude: This is calculated based on PingGainMode using the total in-band power or the in-band envelope in the symbol insertion area, as per step 68 described above, to mask the audibility of the ping while maintaining its recoverability. The magnitude is given as an RMS level, averaged across a 3T period centered on the ping, and it is obtained from Insertion File 9.

[0112] Step 88 derives the symbol from the symbol type and data code information stored in Insertion File 9, and based on the scaled ping outputted by step 86. The result is inputted to step 90 which proceeds to perform the actual insertion, or embedding, of the symbol.

[0113] Figs. 5 and 6 show a 2-ping symbol and a 3-ping symbol, respectively. The ping waveform is derived from a template which has a width of 4T. However, it is almost zero outside the 2T centre area, as shown. The ping waveform template will be discussed below in further detail.

Bandpass Filter

[0114] This filter is used for step 80. It is constructed using the method given above in the Mathematical Background section, with parameters as follows:

• Passband from 950 Hz to 4850 Hz

• Kaiser Window with parameter of 7

• FIR Length: 7.78 msec. This is converted to a number of samples at the current sample rate. The result is rounded to the nearest integer. If the result is even, one is subtracted.

Cosine Window Template

[0115] The template for the cosine window of step 82 has for its length a number of samples equivalent to T (1.451 msec), regardless of sample rate. Thus, the number of samples varies with the sample rate. The template fades from 0 to 1 based on a raised cosine relationship.

[0116] The template is used to fade the beginning and end of the band removal process.

Ping Waveform

[0117] This is constructed as a bandpass FIR filter, using the method given above in the Mathematical Background section, with parameters as follows:

• Passband from 1485 Hz to 3980 Hz

• Kaiser Window with parameter of 8.5

• FIR Length: 5.804 msec. This is converted to a number of samples at the current sample rate. The result is rounded to the nearest integer. If the result is even, one is subtracted.

[0118] The resulting array of samples is scaled so that it has an R.M.S. power of 1, averaged over the middle "3T" (4.353 msec) portion of the WAVE file.
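Two small pieces of arithmetic from this part of the description can be sketched directly: the conversion of an FIR length in milliseconds to an odd number of samples, and the placement of pings 3T apart, centered in the segment. Both function names are hypothetical, and ping positions are expressed in units of T relative to the nominal insertion time.

```python
def fir_length_samples(length_seconds, sample_rate):
    """Round to the nearest integer number of samples; force the result odd."""
    n = round(length_seconds * sample_rate)
    return n - 1 if n % 2 == 0 else n


def ping_offsets_in_T(num_pings):
    """Ping centres, in units of T, spaced 3T apart around the segment centre."""
    return [3.0 * k - 1.5 * (num_pings - 1) for k in range(num_pings)]


for rate in (44100, 48000):
    print(rate, fir_length_samples(7.78e-3, rate), fir_length_samples(5.804e-3, rate))
print(ping_offsets_in_T(2), ping_offsets_in_T(3))
```

An odd length gives the filter a single centre tap, which fits the symmetrical, zero phase construction described for the band removal filter.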

Filter Plots

[0119] Frequency-domain and time-domain response plots for the three linear filters used in the system will now be described.

[0120] The frequency response plots are in decibels against Hz. All of the filters are symmetrical linear phase FIR filters, meaning that they delay all frequencies equally, so no phase response is shown. The time-domain plots are impulse response waveforms. These show the output from the filter, for an input consisting of a single input sample of value 1 at t=0, all other input samples being zero. The time scale is in milliseconds for all plots.

[0121] Fig. 7 shows the time domain response of the filter which is applied to the input signal before rate conversion (see step 20). It removes frequencies which cannot be represented at a sample rate of 11025. Fig. 8 shows the frequency response of the same filter.

[0122] The bandpass filter of step 24 is used in the beginning of the signal path to extract the band of interest. Figs. 9 and 10 show its frequency response and time domain response, respectively.

[0123] The bandpass filter of step 80 is used in the Locator to remove the portion of the signal in the band of interest. Figs. 11 and 12 show its frequency response and time domain response, respectively. The length of the filter, as a number of taps, depends on the sample rate. The plots are for 48000 Hz. The shape of the graphs will change only slightly for other sample rates.

Ping Waveform 

[0124] Figs. 13 and 14 show a signal template for the ping waveform which is used in the insertion process. It is related to the filter response plots of Figs. 7-12 because band-pass filter design techniques are used to generate the ping waveform. Its band-limited spectral content is important to the invention because it must be within the bounds of the bandpass filter techniques described herein for creation and detection of the code. The Ping Waveform Spectral Content is shown in Fig. 13. The Ping Waveform Time Domain is shown in Fig. 14.

ss Adaptive M asking 

[01 25] Another important feature will now be explained in connection with Fig. 1. The data in Output FUe 13 gen erates 
audio sound, when reproduced through suitable equipment which is used by step 15 of Fig. 1 to perform a subjective 
audio quality tost based on psychoacoustic factors. Thus, testing process 15 could involve a group of human test sub- 
40 jects who listen to the audio and provide a descriptive feedback of audio quality, such as "I hear popping noises" or n l 
hear dropouts". The severity of each quality impairment Is also ascertained on a scale of, say, 1 to 5 with 5 being the 
worst 

[0126] Based on the derived feedback Information of (1) the nature of the impairment and (2) Its severity, a decision 
is made by step 1 7 regarcfing whether and how parameters 8 are to be modified. One embodiment of parameter control 
45 process 17 relies on a human operator who utilizes the feedback information and a set of empirically derived rules for 
modifying parameters 8. As an example of such rules, for popping noises assigned a value of 3 on the scale of 1 to 5, 
the PingQain is reduced by 10%. 

[0127] Another embodiment automates process 17 by developing a lookup table using the feedback information from
test 15 as inputs. Step 17 then outputs control signals to suitably vary the parameters 8.
[0128] The modified parameters 8 are used to embed the code as Locator 7 and Inserter 11 are performed once again
to derive a new Output File 13. Test 15 and process 17 can continue to be applied iteratively for modifying parameters
8 until an Output File 13 is derived which generates audio that satisfies the assigned audio quality requirements.
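
The automated embodiment of paragraphs [0127] and [0128] can be pictured as a table lookup inside a feedback loop. The Python sketch below is illustrative only: the `Parameters` class, the layout of the `RULES` table, and the `embed` and `listen` callables are assumptions introduced here, and the single table entry is the popping-noise rule quoted in [0126].

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Parameters:
    """Hypothetical stand-in for parameters 8; only PingGain is modeled."""
    ping_gain: float


# (impairment, severity) -> multiplier applied to PingGain.
# Only this one rule is taken from the text; a real table would hold more.
RULES = {
    ("popping", 3): 0.90,  # reduce PingGain by 10%
}


def adjust(params, impairment, severity):
    """Apply the matching rule, or return params unchanged if none matches."""
    factor = RULES.get((impairment, severity))
    if factor is None:
        return params
    return replace(params, ping_gain=params.ping_gain * factor)


def tune(params, embed, listen, max_rounds=10):
    """Repeat embed -> listen -> adjust until no impairment is reported.

    embed(params) stands for Locator 7 plus Inserter 11 and returns an output.
    listen(output) stands for test 15 and returns None when the audio is
    acceptable, otherwise an (impairment, severity) tuple.
    """
    for _ in range(max_rounds):
        report = listen(embed(params))
        if report is None:
            return params
        params = adjust(params, *report)
    return params
```

The `max_rounds` cap is a design choice of this sketch, not something stated in the text; it simply guards against a loop that never satisfies the listener.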


[0129] The decoder processes digital information derived from a received audio signal, including a waveform with an
embedded code. Typically, it is picked up from an RF receiver. The analog signal is digitized to derive a Decoder Input
File in WAVE format, comparable to Input File 1.

[0130] As shown in Fig. 15, the Encoded Audio Input File 118 is processed by step 120, which is analogous to step
20 of Fig. 2. Step 124 of Fig. 16 is analogous to step 24 to provide a bandpass filter output. Operation 125 provides
signal normalization, as explained below, and a normalized input signal is outputted by step 132.
[0131] Ping identification is performed by step 134, which isolates candidate pings from the band-limited, power-normalized
signal. Each candidate ping is associated with a time point and an amplitude, both of which are determined via
a template correlation process, as explained below.

[0132] In symbol identification step 136, the sequence of pings identified in the first stage is examined to find pairs
of pings, each pair of which forms a symbol. This is necessary because "false" pings occur with some regularity in the
audio product source material itself. The decoder detects symbols by looking for pairs of pings with the correct relationship
to each other. Once the symbols are identified, code assembly step 138 then reconstitutes the embedded code
detected in the audio signal.

[0133] Normalization operation 125 includes the various steps shown in Fig. 16. These are designed to match steps
26-30 of Fig. 2 so that the signals outputted by the encoder and processed by the decoder are matched as well. More
specifically, step 126 is comparable to step 26. Step 128 is comparable to step 28. Step 129 is comparable to step 29.
However, step 130 differs from step 30 in that step 30 includes inversion, but step 130 does not. Therefore, the signal
from step 130 is the inverse of the output from step 30. This inversion by the decoder constitutes an amplitude normalization
which removes variations due to gain or attenuation experienced during sending, playback and receiving operations
for the audio product.

[0134] Details of steps 134, 136 and 138 will now be provided.

[0135] Ping Identification step 134 is shown in detail by Fig. 17. It compares the normalized input signal derived by
step 132 with a ping waveform template, as per step 150 in Fig. 17. Details of this template have been provided above.
A circulating buffer stores as many samples of the normalized input signal as there are data samples in the ping waveform
template. Each sample of the input signal is compared with its respective template data sample. The RMS error
for each sample comparison is calculated to derive an error function $E_j$ and an overall scaling factor $k_j$ per step 152, as
described below.

[0136] The comparison algorithm finds the value of $k_j$ which minimizes the error function

\[ E_j = \sum_i \left( s_{i+j} - k_j t_i \right)^2 \]

where $t_i$ is the fixed template and $s_i$ is the normalized input signal being analyzed. This can be expanded as

\[ E_j = \sum_i s_{i+j}^2 - 2 k_j \sum_i s_{i+j} t_i + k_j^2 \sum_i t_i^2 \]

This is minimized by setting the derivative with respect to $k_j$ to zero, resulting in:

\[ k_j = \frac{\sum_i s_{i+j} t_i}{\sum_i t_i^2} \]

[0137] Since $t_i$ is a fixed template, the denominator in the above is a constant which can be pre-calculated.
[0138] Calculation of

\[ \sum_i s_{i+j} t_i \]

allows $k_j$ to be calculated. The polarity of $k_j$ determines the polarity of the received ping. Additionally, calculation of

\[ \sum_i s_{i+j}^2 \]

allows $E_j$ to be found.
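
The least-squares fit described in [0136] to [0138] can be sketched in a few lines of Python. This is a minimal illustration, not the decoder's actual code: the function name `fit_template` and the use of plain lists are assumptions. It computes the scale factor as the input-template correlation divided by the template energy, and the residual error from the expanded sum of squares.

```python
def fit_template(window, template):
    """Return (k, err) for one buffer position.

    k   is the scale factor that minimizes sum((s - k*t)**2)
    err is that minimized sum of squared differences
    """
    if len(window) != len(template):
        raise ValueError("window and template must have the same length")
    tt = sum(t * t for t in template)                   # template energy, constant
    st = sum(s * t for s, t in zip(window, template))   # correlation term
    ss = sum(s * s for s in window)                     # input energy
    k = st / tt
    err = ss - 2.0 * k * st + k * k * tt
    return k, max(err, 0.0)  # clamp tiny negative rounding errors


# An inverted, tripled copy of the template fits exactly with k = -3.
k, err = fit_template([-3.0, 6.0, -3.0], [1.0, -2.0, 1.0])
```

Because the template is fixed, `tt` would normally be computed once outside the loop over buffer positions; it is recomputed here only to keep the sketch self-contained.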

[0139] The decoder performs this comparison to search for pings on an 11025 Hz signal. The search is actually done
to a resolution of 44100 Hz. This is done by using four separate templates (representing 1/4-sample shifts of the ideal
ping template relative to 11025 Hz). At each iteration, the template with the largest

\[ \sum_i s_{i+j} t_i \]

and, thus, the largest $k$ and lowest $E$, is found. This provides a higher position resolution for locating pings, and a more
accurate measurement of the ping's magnitude and closeness of match.
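
The four-template search of [0139] might look like the sketch below. It assumes the shifted templates have already been generated and are passed in as a list; how they are produced is not shown here. Comparing correlations by absolute value, so that pings of either polarity are handled alike, is also an assumption of this sketch, as is the name `best_shift`.

```python
def best_shift(window, shifted_templates):
    """Return (index, k, err) for the shifted template that correlates best.

    index identifies which fractional-sample shift matched; k and err are
    the least-squares scale factor and residual error for that template.
    """
    ss = sum(s * s for s in window)
    best = None
    for index, template in enumerate(shifted_templates):
        st = sum(s * t for s, t in zip(window, template))
        tt = sum(t * t for t in template)
        # Assumption: rank by |correlation| so negative pings are not missed.
        if best is None or abs(st) > abs(best[1]):
            best = (index, st, tt)
    index, st, tt = best
    k = st / tt
    err = ss - 2.0 * k * st + k * k * tt
    return index, k, max(err, 0.0)
```

With four quarter-sample templates, the returned index refines the ping position to one quarter of an 11025 Hz sample, which corresponds to the 44100 Hz resolution mentioned above.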

[0140] Thus, the calculated value of $k_j$ minimizes the error $E_j$ between the input signal and the ping template. That
same value of $k_j$ is used to scale the magnitude of the ping template, and $k_j$ is regarded as the ping magnitude.
[0141] In order for a ping to be recognized as a candidate and passed to the next phase, the following two conditions
must be met:

• The error must be sufficiently small relative to the ping magnitude. The ping magnitude is divided by the square root
of the error. This result is effectively a "shape matching" metric; scaling the input has no effect on this ratio. The
ping is rejected, per step 154, if the ratio is less than parameter PingRatThresh (labeled "A" in Fig. 17 for convenience),
which is dimensionless. A high ping magnitude value with a low error value constitutes a good match.

• The magnitude of the ping must be at least MinPingMag (labeled "B" in Fig. 17 for convenience), as per step 156.
This is in signal units.

[0142] If the decisions in steps 154 and 156 are favorable, then step 158 outputs this portion of the input signal as a
candidate ping for further processing by symbol identification step 136.
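
The two acceptance tests of steps 154 and 156 can be sketched as a single predicate. The default thresholds are the representative values given in Table 8; the function name `is_candidate`, the use of the absolute value of the scale factor as the magnitude, and the treatment of a zero error as a perfect match are assumptions made for this illustration.

```python
import math


def is_candidate(k, err, ping_rat_thresh=10.0, min_ping_mag=200.0):
    """Return True if a fitted ping passes both candidate tests.

    k   : least-squares scale factor (its sign carries the polarity)
    err : minimized squared error for the same buffer position
    """
    magnitude = abs(k)
    if magnitude < min_ping_mag:       # test "B": too weak, in signal units
        return False
    if err <= 0.0:                     # assumed: exact match always passes "A"
        return True
    ratio = magnitude / math.sqrt(err) # dimensionless shape-matching metric
    return ratio >= ping_rat_thresh    # test "A"
```

Scaling the input by a factor a multiplies the magnitude by a and the error by a squared, so the ratio is unchanged, which is why the first test measures shape rather than level.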

[0143] Even though a portion of the input signal has been accepted as a candidate ping, it may in fact be an impostor.
To aid in the determination of proper pings, the coding method uses a pairing of pings to create symbols. The symbols
that are inserted by the encoder have very clear and unique characteristics that make detection of false pings easier.
[0144] Details of symbol identification are shown in Fig. 18. Test 160 compares the time between two consecutive
candidate pings against a threshold C which is less than the fixed time between consecutive pings (as explained
below). If the spacing is greater than the threshold, step 161 outputs the signal as a real ping to further processing in
connection with symbol identification. If the two pings occur too close to each other for them to be a symbol pair, a test
162 is performed to determine which ping is to be rejected. The magnitude of the first ping is multiplied by the parameter
"NewPingShadow" (labeled "D" in Fig. 18 for convenience), which is normally set to 0.1. If the amplitude of the second
ping is less than the result, it is discarded as a false ping. If it is larger, then it is considered per step 164 to be the start
of a new symbol frame to be outputted as a ping by step 161.
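
Tests 160 and 162 amount to discarding small pings that fall in the "shadow" of a stronger ping just before them. The sketch below is one possible reading: the function name `drop_shadowed`, the representation of pings as (time, magnitude) tuples, and the choice to compare each ping with the most recently kept ping are assumptions. Threshold C is passed in by the caller.

```python
def drop_shadowed(pings, threshold_c, new_ping_shadow=0.1):
    """Filter candidate pings, given as (time, magnitude) in time order.

    A ping closer than threshold_c to the previously kept ping is dropped
    when its magnitude is below new_ping_shadow times that ping's magnitude.
    Otherwise it is kept, either as a normally spaced ping or as the start
    of a new symbol frame.
    """
    kept = []
    for time, magnitude in pings:
        if kept:
            prev_time, prev_magnitude = kept[-1]
            too_close = (time - prev_time) <= threshold_c
            in_shadow = abs(magnitude) < new_ping_shadow * abs(prev_magnitude)
            if too_close and in_shadow:
                continue  # false ping
        kept.append((time, magnitude))
    return kept
```

With a threshold of 4 ms, for example, a ping of magnitude 50 arriving 1 ms after a ping of magnitude 1000 is dropped, while a ping of magnitude 500 at 2 ms is kept as the start of a new frame.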

[0145] Pings that are inserted as symbols will have the same amplitude at the time of insertion, as explained above
in connection with the encoder. If they do not have the same amplitude during decoding, step 166 determines which of
two consecutive pings is larger. The ratio of the amplitude of the larger ping to the smaller is compared by step 168 to
the "MaxPingSkew" parameter (labeled "E" in Fig. 18 for convenience). The result must not exceed the threshold set by
this parameter; otherwise, the pings are considered not to be a symbol pair.

[0146] The spacing between ping pairs in a symbol is fixed at 4.354 msec. The spacing between consecutive pings
is calculated and compared in step 170 to this fixed spacing. The absolute discrepancy between the calculated spacing
and the fixed spacing must not exceed the "MaxPingSpacDev" parameter (labeled "F" in Fig. 18 for convenience).
[0147] Symbol insertion creates a "quiet" portion between pings. The power of the band-limited, normalized signal is
measured by step 172 within a one T time window about a point halfway between the candidate pings. In a true symbol,
there is very little signal energy there. The product of the magnitudes of the two pings is divided by the measured power
signal. The result, as determined by step 174, must exceed the "MaxGapRatio" parameter (labeled "G" in Fig. 18 for
convenience) for the two pings to be considered a true symbol. If so, step 176 outputs the pings as a true symbol.
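
The three pair tests of steps 168, 170 and 174 can be combined into one predicate, sketched below. The spacing, spacing-deviation and gap-ratio defaults are the values stated in the text and in Table 8; the function name `is_symbol_pair`, the argument layout, and the handling of zero values are assumptions. The skew limit is left as a required argument, and the gap power is assumed to have been measured by the caller.

```python
SYMBOL_SPACING = 4.354e-3    # seconds between the two pings of a symbol
MAX_PING_SPAC_DEV = 181e-6   # seconds, parameter "F"
MAX_GAP_RATIO = 250.0        # parameter "G"


def is_symbol_pair(t1, mag1, t2, mag2, gap_power, max_ping_skew):
    """Return True if two consecutive pings form a valid symbol pair."""
    larger = max(abs(mag1), abs(mag2))
    smaller = min(abs(mag1), abs(mag2))
    # Step 168: amplitude ratio, larger over smaller, must not exceed "E".
    if smaller == 0.0 or larger / smaller > max_ping_skew:
        return False
    # Step 170: spacing must stay within "F" of the fixed symbol spacing.
    if abs((t2 - t1) - SYMBOL_SPACING) > MAX_PING_SPAC_DEV:
        return False
    # Step 174: magnitude product over mid-gap power must exceed "G".
    # Assumed: zero measured power means a perfectly quiet gap, which passes.
    if gap_power > 0.0 and abs(mag1 * mag2) / gap_power <= MAX_GAP_RATIO:
        return False
    return True
```

At the 11025 Hz rate mentioned in [0139], 4.354 msec corresponds to about 48 samples and 181 µsec to about 2 samples, so the spacing test tolerates roughly two samples of drift in either direction.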
[0148] The true symbols outputted by step 136 are then collected and the resultant code is determined by code
assembly step 138 by applying the information set forth above in Table 7.

[0149] Representative values of the decoder parameters are provided in the following table.

Table 8

Parameter         Value
PingRatThresh     10
MinPingMag        200
Threshold C       3T - 181 µsec
NewPingShadow     0.1
MaxPingSkew       ao
MaxPingSpacDev    181 µsec
MaxGapRatio       250



[0150] Although specific embodiments of the invention have been described above in detail, it should be understood
that various modifications thereto can readily be made by anyone with ordinary skill in the art. For example, operation
of the encoder is not done in real time relative to, for example, live broadcast of an audio product. However, if processing
delay is of a fixed and acceptable amount, then the encoding could be done in real time. Operation of the decoder is
normally done in real time. Implementation of the encoder and decoder can be in hardware (e.g. digital signal processors)
or software depending on the specific requirements and tolerances of the particular application. The locator function
has been described as performing the minimum spacing test first and then the masking test. However, this
sequence can be inverted. Each record 44 stores two in-band envelope levels, but it is also contemplated that storing
only one would suffice. Also, rather than having parallel signal paths 22 and 34, these could be implemented to operate
sequentially. All such modifications are intended to fall within the scope of the invention, as defined by the following
claims.


Claims 

1. A method of embedding a digital code in a digitized audio product, comprising the steps of:

filtering the digitized audio product to a frequency band of interest;

determining a tonality indication for each of a plurality of segments of the filtered audio product which indicates
the extent to which power is distributed uniformly for frequencies in at least a portion of said band of interest;
and

inserting at least a portion of the digital code into a particular segment from said plurality of segments only if
said tonality indication indicates a relatively uniform power distribution in said particular segment.

2. The method of claim 1, wherein said inserting step is performed only if at least one of the segments immediately
before and immediately after said particular segment also has a tonality indication which indicates a relatively uniform
power distribution.

3. The method of claim 1, wherein all of said plurality of segments have a uniform duration.

4. The method of claim 1, further comprising the step of determining the total power for at least said particular segment,
and performing said inserting step only if said total power is above a predetermined threshold.


5. A method of embedding a digitized code in a digitized audio product, comprising the steps of:

filtering the digitized audio product to a frequency band of interest;

providing a coding signal derived from a band-limited impulse function with a waveform having its energy confined
to and evenly spread across at least a portion of said frequency band of interest;

deriving said digitized code from said coding signal; and
embedding said digitized code into said audio product.

6. The method of claim 5, wherein said frequency band of interest is approximately 0 to 5000 Hz, and said coding signal
has energy spread across approximately 1500 to 4000 Hz.

7. The method of claim 5, wherein said digitized code is derived based on polarity of said coding signal. 

8. The method of claim 5, wherein each bit of said digitized code is derived based on a plurality of said coding signals. 

9. The method of claim 5, wherein said coding signal is derived from an ideal mathematical impulse function to which
a steep bandpass filter is applied.

10. A method of providing a digitized code to be embedded in a digitized audio product, comprising the steps of:

providing said digitized code as a series of binary bits;

dividing said binary bits into groups, each group having a plurality of bits;

providing coding signals to represent said bits, respectively;

deriving a symbol from said coding signals for each of said groups, each symbol having a plurality of said coding
signals with a preset spacing therebetween.

11. The method of claim 10, wherein each symbol has coding signals equal in number to bits in a group to which such
symbol corresponds.

12. The method of claim 10, wherein the coding signals are identical to each other in shape, with one binary bit corresponding
to a coding signal of one polarity, and the other binary bit corresponding to a signal of the other polarity.

13. The method of claim 10, wherein a symbol consists of two of said coding signals.

14. The method of claim 10, wherein a symbol consists of three of said coding signals with equal spacing between
adjacent ones.

15. A method of encoding and decoding a digitized code embedded in a digitized audio product, comprising the steps
of:

deriving said digitized code in a form of start, data and end symbol types, each symbol representing a plurality
of bits, and each bit being associated with a coding signal of given polarity;

generating said start type of symbol to consist of a plurality of said coding signals all of which have the same
designated polarity;

embedding said digitized code in said digitized audio product;
detecting said digitized code embedded in said audio product; and

decoding said detected digitized code by determining whether the polarity of the coding signals on said start
type of symbol is said designated polarity and, if not, inverting the polarity of said coding signals in said data
and end types of symbols.

16. A method of embedding a digitized code in a digitized audio product, comprising the steps of:

identifying segments of the digitized audio product into which the digitized code can be embedded based on
predetermined criteria;

generating portions of the digitized code for insertion into said segments, respectively;
removing the digitized audio product within said identified segments except for a predetermined small percentage
of amplitude to generate modified segments; and

inserting said portions of the digitized code into said modified segments, respectively.

17. A method of embedding a digitized code in a digitized audio product, comprising the steps of:

analyzing the digitized audio product to derive measured values for designated characteristics thereof;
locating segments of the digitized audio product, based on said derived measured values and a set of preselected
parameters, into which the digitized code can be inserted so as to be masked;
inserting the digitized code into said located segments;

determining whether a degree of masking of the inserted digitized code meets a predetermined level and, if
not, modifying values of at least one of said set of preselected parameters, and then performing said locating
and inserting steps again with said modified values.

18. A method of embedding a digitized code in an audio product, comprising the steps of:

dividing said digitized code into preselected portions;

representing said portions by a plurality of coding symbols, respectively;

determining spacing of said coding symbols from each other to be used for embedding the digitized code within
the audio product so that said spacing is greater than a predetermined minimum; and
inserting the coding symbols within the audio product based on said determined spacing.

19. The method of claim 18, wherein said predetermined minimum spacing for a present symbol is in relation to a location
of a previous symbol.

20. The method of claim 18, wherein said predetermined minimum spacing for a present symbol is in relation to a location
of the immediately preceding previous two symbols.

21. The method of claim 18, wherein said spacing is derived to produce a random spacing. 

22. A method of decoding an audio product into which a code of digitized coding signals has been embedded, comprising
the steps of:

obtaining a digitized audio product;

comparing said digitized audio product with a template of a coding signal to identify candidate coding signals
based on shape;

comparing pairs of sequential candidate coding signals with each other based on preselected characteristics
to identify which ones constitute the coding signals; and

reconstructing said code from said coding signals identified by said comparing step.

23. The method of claim 22, wherein said comparing step compares amplitude. 

24. The method of claim 22, wherein said comparing step compares a predetermined spacing with spacing between a 
sequential pair of said candidate coding signals. 

25. Apparatus for embedding signals representing a digitized code in an audio product, comprising:

a bandpass filter receiving the audio product and having an output in a frequency band of interest;

a signal generator outputting a coding signal derived from a band-limited impulse function with a waveform
having its energy confined to and evenly spread across at least a portion of the frequency band of interest;

means for deriving the signals for representing said digitized code from said coding signal; and

means for embedding said signals representing the digitized code into said audio product.

26. The apparatus of claim 25, wherein said signals representing the digitized code are derived based on polarity of 
said coding signal. 

27. The apparatus of claim 25, wherein each bit of said digitized code is represented based on a plurality of said coding 
signals. 

28. The apparatus of claim 25, wherein said coding signal is derived from an ideal mathematical impulse function to
which a steep bandpass filter is applied.

29. A method of providing a digitized code to be embedded in an audio product, comprising the steps of:

representing said digitized code by a plurality of analog symbols to be embedded in the audio product;
obtaining a plurality of data samples corresponding to a digitized audio product;

filtering a portion of the plurality of data samples of the digitized audio product to a frequency band of interest;
calculating an RMS envelope signal for said filtered portion of the digitized audio product; and
controlling an amplitude level of said symbols in accordance with said calculated RMS envelope signal.
30. A method of decoding an audio product into which a code of coding signals has been embedded, comprising the
steps of:

obtaining a digitized audio product with the code of coding signals being embedded therein;

storing a digitized template of a coding signal as a given number of data samples;

storing from one portion of the digitized audio product a number of data samples equal to said given number
of data samples of said stored template;

comparing corresponding data samples of the stored template and audio product with each other to determine
an error function between said template and said stored portion of the digitized audio product;
determining a scaling factor for the stored audio product which produces a minimum error from said error function;

determining amplitude for said one portion of the audio product from said scaling factor; and

combining said minimum error with said amplitude to recognize whether a coding signal is included in said
stored portion of the digitized audio product.

31. The method of claim 30, wherein said step of combining comprises dividing the error by the amplitude to produce
a result and said coding signal is recognized if said result exceeds a given threshold.

32. Apparatus for decoding an audio product into which a code of coding signals has been embedded, comprising:

means for obtaining a digitized audio product with the code of coding signals being embedded therein;
means for storing a digitized template of a coding signal as a given number of data samples;
means for storing from one portion of the digitized audio product a number of data samples equal to said given
number of data samples of said stored template;

means for comparing corresponding data samples of the stored template and audio product with each other to
determine an error function between said template and said stored portion of the digitized audio product;
means for determining a scaling factor for the stored audio product which produces a minimum error from said
error function;

means for determining amplitude of the audio product from said scaling factor; and

means for combining said minimum error with said amplitude to recognize whether a coding signal is included
in said stored portion of the digitized audio product.

33. The apparatus of claim 32, wherein said combining means comprises means for dividing the error by the amplitude 
to produce a result, wherein said coding signal is recognized if said result exceeds a given threshold. 

34. Apparatus for providing a signal representing a digitized code to be embedded in an audio product, comprising:

means for providing said digitized code as a series of binary bits;

means for dividing said binary bits into groups, each group having a plurality of bits;

means for providing coding signals to represent said bits, respectively; and

means for deriving a symbol from said coding signals for each of said groups, each symbol having a plurality
of said coding signals with a preset spacing therebetween.

35. The apparatus of claim 34, wherein each symbol has coding signals equal in number to bits in a group to which
such symbol corresponds.

36. The apparatus of claim 34, wherein the coding signals are identical to each other in shape, with one binary bit corresponding
to a coding signal of one polarity, and the other binary bit corresponding to a signal of the other polarity.

37. The apparatus of claim 34, wherein a symbol consists of two of said coding signals. 

38. The apparatus of claim 34, wherein a symbol consists of three of said coding signals with equal spacing between 
adjacent ones. 

39. Apparatus for decoding an audio product into which coding signals representing a digitized code have been
embedded, comprising:

means for obtaining a digitized audio product with signals representing said digitized code being embedded
therein;

first means for comparing said digitized audio product with a template of a coding signal to identify candidate
coding signals based on shape;

second means for comparing pairs of sequential candidate coding signals with each other based on preselected
characteristics to identify which ones constitute the coding signals; and
means for reconstructing said code from said coding signals identified by said second comparing step.

40. The apparatus of claim 39, wherein said second comparing means compares amplitude.

41. The apparatus of claim 39, wherein said second comparing means compares a predetermined spacing with spac- 
ing between a sequential pair of said candidate coding signals. 

42. Apparatus for embedding a digitized code in an audio product, comprising:

means for identifying segments of the audio product into which the digitized code can be embedded based on
predetermined criteria;

means for generating coding signals representing portions of the digitized code for insertion into said segments,
respectively;

means for removing the audio product within said identified segments except for a predetermined small percentage
of amplitude to generate modified segments; and

means for inserting said coding signals representing portions of the digitized code into said modified segments,
respectively.

43. Apparatus for embedding a digital code in an audio product, comprising:

means for filtering the audio product to a frequency band of interest;

means for determining a tonality indication for each of a plurality of segments of the filtered audio product
which indicates the extent to which power is distributed uniformly for frequencies in at least a portion of said
band of interest;

means for representing said digital code with symbols; and

means for inserting at least one of said symbols into a particular segment from said plurality of segments only
if said tonality indication indicates a relatively uniform power distribution in said particular segment.

44. Apparatus for embedding a digitized code in a digitized audio product, comprising:

means for analyzing the audio product to derive measured values for designated characteristics thereof;
means for locating segments of the audio product, based on said derived measured values and a set of preselected
parameters, into which signals representing the digitized code can be inserted so as to be masked;
means for inserting said signals representing the digitized code into said located segments; and
means for determining whether a degree of masking of the inserted digitized code meets a predetermined level
and, if not, modifying values of at least one of said set of preselected parameters, and then performing said
locating and inserting steps again based on said modified values.

45. Apparatus for encoding and decoding a digitized code embedded in an audio product, comprising:

means for representing said digitized code by symbols in a form of start, data and end symbol types, each of
said symbols representing a plurality of bits, and each bit being associated with a coding signal of given polarity;

means for generating said start type of symbol to consist of a plurality of said coding signals all of which have
the same designated polarity;

means for embedding the symbols representing said digitized code in said audio product;
means for detecting said symbols embedded in said audio product;

means for determining whether the polarity of the coding signals for said detected start type of symbol is said
designated polarity and, if not, inverting the polarity of said coding signals in said data and end types of symbols;
and

means for decoding said digitized code based on the coding signals produced by said determining means.

46. A method of providing a digitized code to be embedded in an audio product, comprising the steps of:

representing said digitized code by a plurality of analog symbols to be embedded in the audio product;
applying a frequency transform to said audio product to obtain a transform signal within a frequency band of
interest;

calculating power of the transform signal in each of a plurality of frequency groups, wherein each of said frequency
groups includes a portion of said band of interest; and

controlling an amplitude level of said symbols in accordance with said calculated power.

47. The method of claim 46, wherein said amplitude level of the symbols is the RMS level. 

48. The method of claim 46, further comprising the step of obtaining a total power from the powers calculated for each
of said frequency groups, and wherein said controlling step utilizes said total power.

49. A storage medium storing processor implementable instructions for controlling a processor to carry out the method
of any one of claims 1 to 24, 29 to 31, and 46 to 48.

50. A storage medium storing an audio signal having a digital code embedded therein by the method of any one of
claims 1 to 9 or 16 to 21.

51. An audio signal having a digital code embedded therein by the method of any one of claims 1 to 9 or 16 to 21.

52. An electrical signal carrying computer implementable instructions for controlling a processor to carry out the
method of any one of claims 1 to 24, 29 to 31 and 46 to 48.